Mellin Transforms and Asymptotics: The Mergesort Recurrence
Abstract
Mellin transforms and Dirichlet series are useful in quantifying periodicity phenomena present in recursive divide-and-conquer algorithms. This note illustrates the techniques by providing a precise analysis of the standard top-down recursive mergesort algorithm, in the average case as well as in the worst and best cases. It also derives the variance and shows that the cost of mergesort has a Gaussian limiting distribution. The approach is applicable to a number of divide-and-conquer recurrences.

Many algorithms are based on a recursive divide-and-conquer strategy of splitting a problem into two subproblems of equal or almost equal size, separately solving the subproblems, and then knitting their solutions together to find the solution to the original problem. Accordingly, their complexity is expressed by recurrences of the usual divide-and-conquer form

$$f_n = f_{\lfloor n/2 \rfloor} + f_{\lceil n/2 \rceil} + e_n,$$

where the initial condition $f_1$ and the knitting costs $e_n$ depend on the problem being studied. Typical examples are mergesort, heapsort, Karatsuba's multiprecision multiplication, discrete Fourier transforms, binomial queues, sorting networks, etc. It is relatively easy to determine general orders of growth for solutions to these recurrences, as explained in standard texts (see the master theorem of [...]). However, a precise asymptotic analysis is often appreciably more delicate. At a more detailed level, divide-and-conquer recurrences tend to have solutions that involve periodicities, many of which are of a fractal nature.

It is our purpose here to discuss the analysis of such periodicity phenomena while focusing on the analysis of the standard top-down recursive mergesort algorithm. For example, as we shall soon see, the average cost of running mergesort on $n$ keys satisfies

$$U(n) = n \lg n + n B(\lg n) + o(n),$$

where $B(x)$ is a fractal-like periodic function. Similarly, the variance of the cost of mergesort is

$$V(n) = n C(\lg n) + o(n),$$

where $C(x)$ is also a fractal-like periodic function. The methods employed (Mellin transforms, Dirichlet series, Perron's formula) borrow from classical analytic number theory. Related problems, with emphasis on digit sums and exact summatory formulae, are discussed in [...].

We first review the mergesort algorithm and derive the equations that describe its behavior. We then introduce the analytic tools that we will use and utilize them to develop a general technique for deriving precise asymptotics of divide-and-conquer recurrences. Next, we apply this general technique to re-derive the already known worst-case and best-case costs of mergesort, and then to derive the average-case cost of mergesort. After that, we discuss the actual distribution of the cost of mergesort, analyzing its variance and proving that it has a Gaussian limiting distribution. We conclude by briefly sketching some other possible uses of our general technique.

In what follows we set $\lg n = \log_2 n$ and use the standard notation for fractional parts, $\{u\} = u - \lfloor u \rfloor$. A preliminary version of this paper was presented at the ICALP Conference.

Mergesort

Mergesort (Figure 1; see [...] for a fuller description) sorts a file of $n$ elements by (a) splitting it into two parts of sizes $\lfloor n/2 \rfloor$ and $\lceil n/2 \rceil$ respectively, (b) recursively mergesorting the two subfiles, and then (c) merging the two sorted subfiles together. The recursion terminates when $n = 1$, because a file with one element is already sorted. The cost, in number of comparisons performed by mergesort, satisfies the canonical divide-and-conquer recurrence

$$f_n = f_{\lfloor n/2 \rfloor} + f_{\lceil n/2 \rceil} + e_n \quad (n \ge 2), \qquad f_1 = 0,$$

where the actual values of the $e_n$, the costs of the merges, depend upon whether it is the worst, best, or average case that is being analyzed.

For a better understanding of the values of the $e_n$ we require a deeper understanding of the mechanics of the merging procedure itself. Suppose $A = a_1, \ldots, a_\alpha$ and $B = b_1, \ldots, b_\beta$ are two lists of numbers, both already sorted in nondecreasing order. The procedure Merge(A, B, D), as described in Figure 2, merges the two sorted lists to form a new sorted list $C = c_1, \ldots, c_{\alpha+\beta}$ whose elements are those of
$A \cup B$, and then copies this list into list $D$. It does this by comparing the largest element in $A$ to the largest element in $B$, removing the maximum of the two from the list in which it is located, and placing it in $C$. It then compares the largest remaining element in $A$ to the largest remaining element in $B$ and again removes the maximum, this time inserting it into the second largest spot in $C$. It continues this process of comparing the largest remaining elements in the two lists against each other, removing the maximum, and concatenating it to the back of $C$, until one of the two lists is empty. It then moves all of the elements from the nonempty list over to the back of $C$. Since the elements remaining on the nonempty list are all smaller than the ones that have already been moved, and also are all already in sorted order, moving them over to $C$ requires no further comparisons. In fact, if a list-based (as opposed to an array-based) merge is used, we can move all of the remaining items over to $C$ simply by changing the address in one pointer. The cost, in number of comparisons, of merging a size-$\alpha$ list with a size-$\beta$ one (what we call an $\alpha : \beta$ merge) is $\alpha + \beta - S$, where $S$ is the number of elements left on the nonempty list at the end of the procedure.

    Algorithm MergeSort(a[1..n])
      if n > 1 then {
        MergeSort(a[1..floor(n/2)]);
        MergeSort(a[floor(n/2)+1..n]);
        Merge(a[1..floor(n/2)], a[floor(n/2)+1..n], a[1..n])
      }

Figure 1. Top-Down Recursive Mergesort.

    Procedure Merge(A, B, D)
      /* Merges sorted lists A and B into list C and then copies the result into list D */
      alpha := size(A); beta := size(B);
      while alpha > 0 and beta > 0 do       /* Compare largest elements in each list */
        if A[alpha] > B[beta]
          then { C[alpha+beta] := A[alpha]; alpha := alpha - 1 }
          else { C[alpha+beta] := B[beta]; beta := beta - 1 };
      /* Move contents of nonempty list over to C */
      S := max(alpha, beta);
      if beta = 0
        then for i := S downto 1 do C[i] := A[i]
        else for i := S downto 1 do C[i] := B[i];
      for i := 1 to size(A) + size(B) do D[i] := C[i]   /* Copy C into D */

Figure 2. The Merging Procedure. It works by successively comparing the largest remaining elements in A and B.

We now proceed with the analysis of mergesort. The top-level merge performed by the algorithm is a $\lfloor n/2 \rfloor : \lceil n/2 \rceil$ one. In the worst case $S = 1$, so $T(n)$, the worst-case behavior of mergesort, satisfies

$$T(n) = T(\lfloor n/2 \rfloor) + T(\lceil n/2 \rceil) + n - 1 \quad (n \ge 2), \qquad T(1) = 0.$$

The best case of an $\alpha : \beta$ merge occurs when all of the items in the larger file are smaller than all of the items in the smaller one, so that the larger file is left untouched and $S = \max(\alpha, \beta)$. The best case of the $\lfloor n/2 \rfloor : \lceil n/2 \rceil$ merge then uses $n - \lceil n/2 \rceil = \lfloor n/2 \rfloor$ comparisons, so $Y(n)$, the best-case behavior of mergesort, satisfies

$$Y(n) = Y(\lfloor n/2 \rfloor) + Y(\lceil n/2 \rceil) + \lfloor n/2 \rfloor \quad (n \ge 2), \qquad Y(1) = 0.$$

This occurs, for example, when the numbers $1, \ldots, n$ are in the file in inverted order.

The average case is much more interesting. Our derivation follows that of [...]. To study the average case, we assume that the elements in $A \cup B$ are the integers $1, \ldots, \alpha + \beta$ and, further, that each of the $\binom{\alpha+\beta}{\alpha}$ possible partitions of the numbers into sorted lists $A$ and $B$ is equally likely. Recall that $S$ is the number of items left in the nonempty list by the merging procedure; these items are the $S$ smallest items in $A \cup B$. Thus $S \ge s$ if and only if one of the two
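The merging procedure and the worst-case and best-case recurrences for mergesort can be checked numerically on small inputs. The following is a minimal Python sketch under stated assumptions, not the authors' code: the function names (`merge`, `mergesort`, `worst`, `best`, `average_by_enumeration`) are illustrative, the keys are assumed distinct, and the average is taken over all $n!$ orderings of the keys, each assumed equally likely.

```python
from fractions import Fraction
from functools import lru_cache
from itertools import permutations


def merge(a, b):
    """Merge sorted lists a and b, largest keys first; return (merged, comparisons)."""
    i, j = len(a), len(b)              # items still unmerged in a and in b
    out = [None] * (i + j)
    comparisons = 0
    while i > 0 and j > 0:
        comparisons += 1               # compare the two largest remaining keys
        if a[i - 1] > b[j - 1]:
            out[i + j - 1] = a[i - 1]
            i -= 1
        else:
            out[i + j - 1] = b[j - 1]
            j -= 1
    # The S = max(i, j) leftover keys are the smallest; moving them costs nothing.
    if i > 0:
        out[:i] = a[:i]
    else:
        out[:j] = b[:j]
    return out, comparisons


def mergesort(keys):
    """Top-down mergesort; return (sorted list, total number of comparisons)."""
    n = len(keys)
    if n <= 1:
        return list(keys), 0
    half = n // 2                      # sizes floor(n/2) and ceil(n/2)
    left, cost_left = mergesort(keys[:half])
    right, cost_right = mergesort(keys[half:])
    merged, cost_merge = merge(left, right)
    return merged, cost_left + cost_right + cost_merge


@lru_cache(maxsize=None)
def worst(n):
    """Worst-case recurrence: top-level merge costs n - 1 (S = 1)."""
    return 0 if n <= 1 else worst(n // 2) + worst(n - n // 2) + n - 1


@lru_cache(maxsize=None)
def best(n):
    """Best-case recurrence: top-level merge costs floor(n/2) (S = ceil(n/2))."""
    return 0 if n <= 1 else best(n // 2) + best(n - n // 2) + n // 2


def average_by_enumeration(n):
    """Exact mean comparison count over all n! orderings (small n only)."""
    costs = [mergesort(list(p))[1] for p in permutations(range(1, n + 1))]
    return Fraction(sum(costs), len(costs))


if __name__ == "__main__":
    for n in range(1, 8):
        costs = [mergesort(list(p))[1] for p in permutations(range(1, n + 1))]
        print(n, min(costs), best(n), max(costs), worst(n), average_by_enumeration(n))
```

For each small $n$, the minimum and maximum comparison counts over all orderings can be compared against `best(n)` and `worst(n)`, and the inverted file `list(range(n, 0, -1))` can be used to check the best case. Exhaustive enumeration is only practical for very small $n$; it serves as a sanity check on the recurrences, not as a substitute for the asymptotic analysis.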
Similar Resources
Exact Asymptotics of Divide-and-Conquer Recurrences
The divide-and-conquer principle is a major paradigm of algorithm design. Corresponding cost functions satisfy recurrences that directly reflect the decomposition mechanism used in the algorithm. This work shows that periodicity phenomena, often of a fractal nature, are ubiquitous in the performance of these algorithms. Mellin transforms and Dirichlet series are used to attain precise asympto...
Limit theorems for mergesort
Central and local limit theorems (including large deviations) are established for the number of comparisons used by the standard top-down recursive mergesort under the uniform permutation model. The method of proof utilizes Dirichlet series, Mellin transforms and standard analytic methods in probability theory.
Some Applications of the Mellin Transform to Asymptotics of Series
Mellin transforms are used to find asymptotic approximations for functions defined by series. Such approximations were needed in the analysis of a water-wave problem, namely, the trapping of waves by submerged plates. The method seems to have wider applicability.
Mellin Transforms and Asymptotics: Finite Differences and Rice's Integrals
High order differences of simple number sequences may be analysed asymptotically by means of integral representations, residue calculus, and contour integration. This technique, akin to Mellin transform asymptotics, is put in perspective and illustrated by means of several examples related to combinatorics and the analysis of algorithms like digital tries, digital search trees, quadtrees, and d...
Mellin Transforms and Asymptotics: Digital Sums
Flajolet, Ph., P. Grabner, P. Kirschenhofer, H. Prodinger and R.F. Tichy, Mellin transforms and asymptotics: digital sums, Theoretical Computer Science 123 (1994) 291-314. Arithmetic functions related to number representation systems exhibit various periodicity phenomena. For instance, a well-known theorem of Delange expresses the total number of ones in the binary representations of the first ...
Publication date: 1993